What is Ambisonics?


• Extensible, hierarchical system for representing sound 
fields 

• Describes how something should sound, rather than specifying
speaker signals.

• Capture or creation 

• Microphone arrays 

• 2-D or 3-D 

• Natural B-format, Tetrahedral, Spherical arrays

• Ambisonic Panners

• Reproduction 

• 2-D, “horizontal” or 3-D “with height” loudspeaker arrays 

• “Any” size or shape array of loudspeakers 
What is an Ambisonic Decoder?


• In Ambisonics, the program format is independent of the 
reproduction layout. 

• The decoder’s task is to create the best perceptual 
impression possible that the sound field is being 
reproduced accurately, given the resources available 

• Bandwidth, number of speakers, configuration of speakers ... 

• We use the term “decoder” to mean the configuration for a 
decoding engine that does the actual signal processing 
Goals for decoder design

• Mimic conditions of natural hearing

• Constant amplitude gain for all source directions (P)

• Constant energy gain for all source directions (E)

• At low frequencies, correct reproduced wavefront direction and
velocity (rV)

• At high frequencies, maximum concentration of energy in the
source direction (rE)

• Matching high- and low-frequency perceived directions

• Getting rE correct is the most difficult aspect

• Recent work shows that it is also the most important!
Designing Decoders

• Decoders for regular polygonal and polyhedral loudspeaker
arrays are easy to design

• Build the speaker encoding matrix, K, by sampling the spherical
harmonics at the speaker directions

• Use the pseudoinverse to find the basic decoding matrix M

• rE is guaranteed to point in the same direction as rV

• However... 

• Room geometry or visual considerations often limit speaker 
placement 

• 3-D HOA requires placing more speakers above and below the 
listener 
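The pseudoinverse recipe on this slide can be sketched in a few lines. This is an illustration only, not the toolbox's code: it assumes first order, an unnormalized [W, X, Y, Z] convention (W = 1, X/Y/Z = direction cosines), and a cube of eight speakers as the regular polyhedron. The function names are hypothetical.

```python
import numpy as np

def encode_first_order(directions):
    """Sample [W, X, Y, Z] at unit vectors given as rows; returns a 4 x n matrix.

    Assumed convention: W = 1 and X, Y, Z are the direction cosines.
    """
    d = np.atleast_2d(np.asarray(directions, dtype=float))
    return np.vstack([np.ones(d.shape[0]), d.T])

def localization_vectors(gains, speakers):
    """Velocity (rV) and energy (rE) vectors for one set of speaker gains."""
    r_v = gains @ speakers / np.sum(gains)
    r_e = (gains ** 2) @ speakers / np.sum(gains ** 2)
    return r_v, r_e

# Eight speakers at the corners of a cube (a regular polyhedron).
corners = np.array([[x, y, z] for x in (-1, 1) for y in (-1, 1) for z in (-1, 1)],
                   dtype=float)
speakers = corners / np.linalg.norm(corners, axis=1, keepdims=True)

K = encode_first_order(speakers)   # speaker encoding matrix, 4 x 8
M = np.linalg.pinv(K)              # basic decoding matrix, 8 x 4

# Decode a test source from an arbitrary direction (unit vector).
source = np.array([0.6, 0.64, 0.48])
gains = M @ encode_first_order(source)[:, 0]
r_v, r_e = localization_vectors(gains, speakers)
print("rV =", np.round(r_v, 4), " rE =", np.round(r_e, 4))
```

For this regular array rV reproduces the source direction with unit magnitude, and rE points the same way with a smaller magnitude, which is the behavior the slide describes.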
Tradeoffs

• Once we deviate from regular geometry 

• we must trade off localization accuracy for uniform loudness 

• Directions of rE and rV are not the same 

• Localization degrades outside the area with a high density 
of loudspeakers 

• Gerzon used nonlinear optimization for this 

• Many implementations: Wiggins, Moore & Wakefield, Tsang, BLaH 

• Works well for small arrays (e.g., ITU 5.1) 

• Convergence is slow for large HOA arrays (hours)

• IDHOA (Scaini and Arteaga) looks promising 

• Better objective function and zero out small coefficients 
New Strategies in Toolbox

• Use an inversion technique suited to ill-conditioned 
matrices 

• Constant energy decoder 

• Truncated SVD 

• Energy limited 

• Invert a well-behaved full-sphere virtual speaker array, 
map to a real array 

• Hybrid Ambisonic-VBAP 

• AllRAD (Zotter and Frank)

• Derive a new set of basis functions for which inversion is 
well behaved 

• Spherical Slepian Functions 

• EPAD (Zotter, Pomberger, Noisternig) 
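As an illustration of the first strategy, here is a minimal sketch of a truncated-SVD inversion. It is not the toolbox's implementation: the relative threshold `rel_tol`, the helper names, and the six-speaker, nearly flat ring are assumptions chosen to make the ill-conditioning visible at first order.

```python
import numpy as np

def first_order_matrix(az_deg, el_deg):
    """Speaker encoding matrix K (4 x n), assumed unnormalized [W, X, Y, Z]."""
    az, el = np.radians(az_deg), np.radians(el_deg)
    return np.vstack([np.ones_like(az), np.cos(az) * np.cos(el),
                      np.sin(az) * np.cos(el), np.sin(el)])

def truncated_svd_pinv(K, rel_tol=0.1):
    """Invert K, discarding singular values below rel_tol times the largest.

    rel_tol is a hypothetical tuning parameter for this sketch.
    Returns the decoding matrix and the number of singular values kept.
    """
    U, s, Vt = np.linalg.svd(K, full_matrices=False)
    keep = s > rel_tol * s[0]
    s_inv = np.zeros_like(s)
    s_inv[keep] = 1.0 / s[keep]
    return (Vt.T * s_inv) @ U.T, int(np.count_nonzero(keep))

# A nearly flat ring: height (Z) is barely observable, so K is ill-conditioned.
az = np.arange(6) * 60.0
el = np.array([0.0, 3.0, 0.0, 3.0, 0.0, 3.0])
K = first_order_matrix(az, el)

M_plain = np.linalg.pinv(K)              # tries to reproduce Z, needs huge gains
M_trunc, rank = truncated_svd_pinv(K)    # drops the poorly supported component

print("kept singular values:", rank)
print("largest gain, plain pinv:", np.abs(M_plain).max())
print("largest gain, truncated :", np.abs(M_trunc).max())
```

The truncated inverse gives up on the component the array cannot reproduce instead of driving the speakers with very large, mostly cancelling gains.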
Are these decoders Ambisonic?

• Ambisonic theory specifies performance goals, not how to 
design a decoder 

• We use the same criteria for these decoders 

• But... 

• Apply them only to source directions in the covered part of the sphere 

• Require them to be “well behaved” in other directions

[Figure: (a) rE vs. test direction, (b) rE direction error (degrees), (c) energy gain (dB)]
CCRMA Listening Room


• 22 identical loudspeakers in 
five rings 

• Horizontal ring of 8 
loudspeakers 

• 2 rings of 6 loudspeakers, 
one 50° below horizontal and 
one 40° above 

• 1 loudspeaker at each pole 

• Array is almost regular 

• Upper 15 used for 
hemispherical dome 

• Full-sphere decoder 
described in our LAC2012 
paper 
AllRAD - Hybrid Ambi-VBAP



• 240 point spherical 
design for virtual 
speaker array 

• Dome of upper 15 
loudspeakers of 
CCRMA Listening 
Room, 8-6-1 

• Imaginary speaker 
at bottom 

• Design procedure 
detailed in paper 
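The hybrid idea on this slide (decode to a dense, well-behaved virtual array, then pan each virtual speaker onto the real one) can be sketched in two dimensions. This is a simplified illustration, not the procedure from the paper: it uses circular harmonics and pairwise amplitude panning on a ring instead of a 240-point spherical design with 3-D VBAP, and the function names, the 72-point virtual ring, and the ITU-style five-speaker layout are assumptions.

```python
import numpy as np

def circular_harmonics(az, order):
    """Rows 1, sqrt(2)cos(m az), sqrt(2)sin(m az), m = 1..order (assumed scaling)."""
    az = np.atleast_1d(np.asarray(az, dtype=float))
    rows = [np.ones_like(az)]
    for m in range(1, order + 1):
        rows += [np.sqrt(2) * np.cos(m * az), np.sqrt(2) * np.sin(m * az)]
    return np.stack(rows)

def vbap_2d(az, speaker_az):
    """Pairwise amplitude panning of one direction onto adjacent real speakers.

    Assumes every gap between neighbouring speakers is smaller than 180 degrees.
    """
    speaker_az = np.asarray(speaker_az, dtype=float)
    order = np.argsort(np.mod(speaker_az, 2 * np.pi))
    target = np.array([np.cos(az), np.sin(az)])
    gains = np.zeros(len(speaker_az))
    for k in range(len(order)):
        i, j = order[k], order[(k + 1) % len(order)]
        base = np.array([[np.cos(speaker_az[i]), np.cos(speaker_az[j])],
                         [np.sin(speaker_az[i]), np.sin(speaker_az[j])]])
        g = np.linalg.solve(base, target)
        if np.all(g >= -1e-9):              # target lies between this pair
            gains[[i, j]] = np.clip(g, 0.0, None)
            return gains / np.linalg.norm(gains)
    raise ValueError("direction not enclosed by any speaker pair")

def allrad_2d(speaker_az, order, n_virtual=72):
    """Decoder matrix (real speakers x harmonics) via a virtual regular ring."""
    virt_az = 2 * np.pi * np.arange(n_virtual) / n_virtual
    d_virtual = circular_harmonics(virt_az, order).T / n_virtual   # basic decode
    g = np.column_stack([vbap_2d(a, speaker_az) for a in virt_az]) # pan to real
    return g @ d_virtual

speaker_az = np.radians([0.0, 30.0, 110.0, -110.0, -30.0])
D = allrad_2d(speaker_az, order=3)

# Decode a front source and look at the energy vector direction.
gains = D @ circular_harmonics(0.0, 3)[:, 0]
energy = gains ** 2
r_e = np.array([energy @ np.cos(speaker_az), energy @ np.sin(speaker_az)]) / energy.sum()
print("decoder shape:", D.shape, " rE:", np.round(r_e, 3))
```

Because the virtual ring is regular, its decoder is just a scaled transpose of the sampled harmonics, so the irregular real layout is handled entirely by the panning step.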
AllRAD performance

[Figure panels: magnitude and direction of rV; rV angular error (degrees); magnitude and direction of pressure gain; magnitude of rV vs. test direction; pressure gain vs. test direction]

AllRAD performance

[Figure panels: magnitude of rE vs. test direction; magnitude and direction of energy gain]

AllRAD rV direction grid

[Figure: CCRMA Listening Room Dome, 3h3p, AllRAD 240, max rE]

AllRAD direction grid

[Figure: CCRMA Listening Room Dome, 3h3p, AllRAD 240, max rE; azimuth in degrees counterclockwise, 0 = front]

[Figure: CCRMA Listening Room Dome, 3h3p, AllRAD 240, max rE; direction difference between rV and rE]
Spherical Slepian Functions

• Linear combinations of spherical harmonics 

• Produce a new set of basis functions that are zero outside 
the region of interest on the sphere 

• Remain orthogonal within the region 

• Used in satellite geodesy to model earth’s gravitational 
and magnetic fields from incomplete data 

• In Ambisonic decoding, we can specify a region of the 
sphere, a dome or a ring, and derive a well behaved set of 
basis functions for that region. 

• Design procedure detailed in paper 
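To make the construction concrete, here is a small first-order sketch of how such region-concentrated functions can be derived numerically. It is an illustration under stated assumptions, not the design procedure from the paper: it uses orthonormal (N3D-style) harmonics up to order 1, a simple midpoint quadrature, and hypothetical function names. Each eigenvalue is the fraction of that function's energy that falls inside the region.

```python
import numpy as np

def first_order_sh(x, y, z):
    """Real harmonics up to order 1, scaled so their mean square over the sphere is 1."""
    return np.stack([np.ones_like(x), np.sqrt(3) * x, np.sqrt(3) * y, np.sqrt(3) * z])

def slepian_basis(el_min_deg, el_max_deg=90.0, n_z=400, n_phi=64):
    """Eigen-decompose the Gram matrix of the harmonics over an elevation band.

    Returns eigenvalues (descending) and the matching coefficient vectors; each
    column gives one new basis function as a linear combination of harmonics.
    """
    z_lo, z_hi = np.sin(np.radians([el_min_deg, el_max_deg]))
    dz = (z_hi - z_lo) / n_z
    z = z_lo + dz * (np.arange(n_z) + 0.5)             # midpoints in z = sin(elevation)
    phi = 2 * np.pi * (np.arange(n_phi) + 0.5) / n_phi
    zz, pp = np.meshgrid(z, phi, indexing="ij")
    rho = np.sqrt(1.0 - zz ** 2)
    Y = first_order_sh(rho * np.cos(pp), rho * np.sin(pp), zz).reshape(4, -1)
    weight = dz * (2 * np.pi / n_phi) / (4 * np.pi)    # equal-area cells in (z, phi)
    gram = weight * (Y @ Y.T)
    lam, coeffs = np.linalg.eigh(gram)
    idx = np.argsort(lam)[::-1]
    return lam[idx], coeffs[:, idx]

# Dome from +90 degrees down to -30 degrees elevation.
lam, coeffs = slepian_basis(-30.0)
print("eigenvalues:", np.round(lam, 4))
```

Functions with eigenvalues near 1 live almost entirely inside the dome and are safe to invert; those with small eigenvalues mostly describe the uncovered region and can be dropped.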
3rd-order spherical harmonics (blue = inverted polarity)

3rd-order spherical Slepian functions for +90° to -30° dome (first 13 used for decoder)

[Figure: one panel per function, labeled with its eigenvalue]
λ1 = 1.0000, λ2 = 1.0000, λ3 = 1.0000, λ5 = 0.9981, λ7 = 0.9576, λ8 = 0.9576, λ9 = 0.9294, λ10 = 0.9294, λ11 = 0.6795, λ12 = 0.5971, λ13 = 0.5971, λ14 = 0.1716, λ15 = 0.1716, λ16 = 0.0148

Heller, Benjamin, The Ambisonic Decoder Toolbox, Linux Audio Conference 2014, ZKM, Karlsruhe, Germany

Spherical Slepian performance

[Figure panels: magnitude and direction of rV; rV angular error (degrees); magnitude and direction of pressure gain; magnitude of rV vs. test direction; axis: elevation (deg)]

Spherical Slepian performance

[Figure panel: magnitude and direction of energy gain]

Spherical Slepian direction grid

[Figure: CCRMA Listening Room Dome 3H3P SlepianIS; rV direction; azimuth in degrees counterclockwise, 0 = front]

Spherical Slepian direction grid

[Figure: CCRMA Listening Room Dome 3H3P SlepianIS]

3rd-order Hybrid Ambi-VBAP (AllRAD)
In situ performance measurements



• Dummy head and reference
omni microphone

• Dome array using upper 15 
speakers in CCRMA’s 
listening room (8-6-1) 


Tested 

• AllRAD Dome

• Spherical Slepian Dome 

• Full-sphere (from LAC2012) 

Collected 

• individual speaker IRs 

• Ambisonically panned IRs at 
10° azimuth, 30° elevation 
intervals for each decoder 

Analyzed horizontal data 

• 250 Hz ITD (rV)

• 1 to 4 kHz ILD (rE)
ITD and ILD measurements

[Figure panels: 250 Hz ITD; 1-4 kHz ILD]


Observations 

The measured ITDs were 
similar with the three 
decoders but ILDs were 
very different 

This supports the subjective 
observations that the three 
decoders sound different 

Detailed analysis is pending 
Informal listening tests

• 3rd-order test programs

• Full-sphere mix of “Babel” by Allette Brooks (Jay Kadis)

• Chroma XII by Rebecca Sanders (Jörn Nettingsmeier)

• Both dome decoders sounded good subjectively (but 
different!) 

• Compact and directionally accurate localization down to horizon 

• Faded below horizon 

• SSF decoder sounded brighter and more detailed than AllRAD

• Neither decoder sounded as good as the full-sphere
reference decoder

• 1st-order orchestral recording not reproduced well

• Most of orchestra is below the horizon 
Decoding Engine

• New decoding engine written in FAUST 

• No inherent limit on order 

• Dual band, NFC filters, distance compensation, ... 

• Toolbox writes out configuration section, appends 
implementation 

• Compiles to LADSPA, LV2, Pd, SuperCollider, VST, AU ...

• Can be used independently of toolbox 

• Drawback: Configuration “baked into” plugin 

• Toolbox also writes out configuration files for 

• Kronlachner’s ambiX plugin suite 

• Adriaensen’s AmbDec
Implementation

• Toolbox runs in MATLAB and GNU Octave 

• Implements all known channel ordering and normalization
conventions, and both mixed-order conventions (HP and HV)

• No inherent limit on Ambisonic order 

• Actively in use by a few beta testers 

• Mixed results for graphics output in Octave 

• Moving graphics output code to Python with MayaVi 

• Interface to IDHOA optimizer 

• GNU Affero General Public License 

• Faust decoder engine BSD 3-Clause License 

• Git repo at https://bitbucket.org/ambidecodertoolbox/adt
Summary and Conclusions

• Extensions to Ambisonic Decoder Toolbox to handle speaker
configurations that do not cover full sphere

• New decoder engine written in Faust

• Ability to generate decoders quickly has proven valuable in
performance settings

• Plans 

• Dual-band AIIRAD and Slepian decoders 

• Optimizer to refine decoders 

• Open question: 

• What to do when sources move into areas of poor coverage?

• Current implementation fades them out.

• Decorrelate and mix into other speakers? 

• Should transmission standards include “rendering hints”? 
Human Auditory Localization

• At low frequencies (up to about 800 Hz) works by 
Interaural Time Differences (ITDs) 

• At middle frequencies (800 Hz to 5 kHz) works by 
Interaural Level Differences (ILDs) 

• Transition is fairly sharp 

• due to the ITDs becoming ambiguous once the wavelength
becomes smaller than the ear spacing.

• 2-channel stereo doesn’t get it right 

• ILD cues are such that the images tend to stick to nearest speaker 

• Ambisonics was designed from the beginning to get this 
correct with modest resources. 

• Small number of program channels and loudspeakers 
Gerzon’s Theory of Auditory Localization

• Early workers in stereo did theoretical analysis showing
how stereo did (or didn’t) provide proper localization cues

• Gerzon’s contribution was to integrate those theories and
come up with a theory that defined

• rV, the vector sum of the signals from the loudspeakers

• rE, the vector sum of the squares of the signals from the
loudspeakers.

• By providing a simple mathematical encapsulation, we 
can use these to 

• design decoders 

• prove theorems, e.g., polygonal decoder theorem

• help understand what various spatial sound reproduction systems
can and cannot do 
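The two localization vectors defined on this slide (written rV and rE elsewhere in the deck) are straightforward to compute from a set of speaker gains. The sketch below assumes the usual convention of normalizing the vector sums by the total pressure P and total energy E; the function name and the stereo example are illustrative only.

```python
import math

def gerzon_vectors(gains, directions):
    """Return (P, E, rV, rE) for speaker gains and unit direction vectors.

    rV is the gain-weighted sum of directions divided by P = sum(g);
    rE is the energy-weighted sum divided by E = sum(g^2).
    """
    p = sum(gains)
    e = sum(g * g for g in gains)
    r_v = tuple(sum(g * d[k] for g, d in zip(gains, directions)) / p
                for k in range(3))
    r_e = tuple(sum(g * g * d[k] for g, d in zip(gains, directions)) / e
                for k in range(3))
    return p, e, r_v, r_e

# Two-channel stereo: speakers at +30 and -30 degrees, left twice as loud.
a = math.radians(30.0)
directions = [(math.cos(a), math.sin(a), 0.0), (math.cos(a), -math.sin(a), 0.0)]
p, e, r_v, r_e = gerzon_vectors([1.0, 0.5], directions)

angle_v = math.degrees(math.atan2(r_v[1], r_v[0]))
angle_e = math.degrees(math.atan2(r_e[1], r_e[0]))
print(f"P={p:.2f} E={e:.2f} rV angle={angle_v:.1f} rE angle={angle_e:.1f}")
```

In this stereo example the two vectors point in different directions (about 11° and 19°), which illustrates the mismatch between low- and high-frequency cues that decoder design tries to avoid.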
Localization Vector Theory

• rV predicts low-frequency localization almost perfectly.

• If rV = 1, then low-frequency sounds will be precisely located.

• rE predicts mid-frequency localization moderately well.

• If rE = 1, then mid-frequency localization will be good.

• BUT... rE is always less than 1, unless the sound is coming from a
single point source.

• At best rE = cos(θ/2), where θ is the angle between the
loudspeakers, so for a square array rE ≤ 0.707.

• In general, rE is low in directions with few loudspeakers.

• The best we can do is have performance change smoothly from
dense areas to sparse areas.
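The cos(θ/2) figure quoted on this slide can be checked numerically for the simplest case: a source panned midway between two equally driven loudspeakers. This is a small worked example, not a general proof, and the function name is hypothetical.

```python
import math

def r_e_midway(theta_deg):
    """Magnitude of rE for equal gains on two speakers theta_deg apart."""
    half = math.radians(theta_deg) / 2.0
    directions = [(math.cos(half), math.sin(half)),
                  (math.cos(half), -math.sin(half))]
    gains = [1.0, 1.0]
    energy = sum(g * g for g in gains)
    x = sum(g * g * d[0] for g, d in zip(gains, directions)) / energy
    y = sum(g * g * d[1] for g, d in zip(gains, directions)) / energy
    return math.hypot(x, y)

for theta in (60.0, 90.0, 120.0):
    print(f"spacing {theta:5.1f} deg: rE = {r_e_midway(theta):.3f}, "
          f"cos(theta/2) = {math.cos(math.radians(theta) / 2):.3f}")
```

For the 90° spacing of a square array this gives about 0.707, and the value keeps falling as the speakers move further apart.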
Energy Localization Vector

• Maximizing rE and getting it to point in the right direction is
the crux of the decoder design problem.

• Easy with regular arrays 

• Irregular arrays always involve tradeoffs 

• Virtually all real world arrays are irregular! 

• Arrays need to fit in real rooms 

• ITU 5.1 is the dominant domestic standard, rear speakers 120° apart. 

• Because rE is a nonlinear function of speaker position, we
currently need to use numerical optimization methods.